-
Notifications
You must be signed in to change notification settings - Fork 16k
Ruby: Call "Class#new" over rb_class_new_instance in decoding #7352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby: Call "Class#new" over rb_class_new_instance in decoding #7352
Conversation
This patch has almost no change in behaviour where users have not
patched the implementation of new on either a specific proto object
class, or `Google::Protobuf::MessageExts::ClassMethods`. The default
implementation of `new`, and `rb_class_new_instance` have the same
behaviour.
By default when we call `new` on a class in Ruby, it goes to the `Class`
class's implementation:
```ruby
class Foo
end
>> Foo.method(:new).owner
=> Class
```
the `Class` implementation of `new` is (pseudocode, it's actually in c):
```ruby
class Class
def new(*args, &blk)
instance = alloc
instance.initialize(*args, &blk)
instance
end
end
```
`rb_class_new_instance` does the same thing, it calls down to
[`rb_class_s_new`](https://github.com/ruby/ruby/blob/v2_5_5/object.c#L2147),
which calls `rb_class_alloc`, then `rb_obj_call_init`.
`rb_funcall` is a variadic c function for calling a ruby method on an object,
it takes:
* A `VALUE` on to which the method should be called
* An `ID` which should be an ID of a method, usually created with `rb_intern`,
to get an ID from a string
* An integer, the number of arguments calling the method with,
* A number of `VALUE`s, to send to the method call.
`rb_funcall` is the same as calling a method directly in Ruby, and will perform
ancestor chain respecting method lookup.
This means that in C extensions, if nobody has defined the `new` method on any
classes or modules in a class's inheritance chain calling
`rb_class_new_instance` is the same as calling `rb_funcall(klass,
rb_intern("new"))`, *however* this removes the ability for users to define or
monkey patch their own constructors in to the objects created by the C
extension.
In Ads, we define [`new`](https://git.io/JvFC9) on
`Google::Protobuf::MessageExts::ClassMethods` to allow us to insert a
monkeypatch which makes it possible to assign primitive values to wrapper type
fields (e.g. Google::Protobuf::StringValue). The monkeypatch we apply works for
objects that we create for the user via the `new` method. Before this commit,
however, the patch does not work for the `decode` method, for the reasons
outlined above. Before this commit, protobuf uses `rb_class_new_instance`.
After this commit, we use `rb_funcall(klass, rb_intern("new"), 0);` to construct
protobuf objects during decoding. While I haven't measured it this will have
a very minor performance impact for decoding (`rb_funcall` will have to go to the
method cache, which `rb_class_new_instance` will not). This does however do
the "more rubyish" thing of respecting the protobuf object's inheritance chain
to construct them during decoding.
I have run both Ads and Cloud's test suites for Ruby libraries against this
patch, as well as the protobuf Ruby gem's test suite locally.
|
LGTM, thanks Penelope! |
|
Thanks for writing this up Penelope! I'm glad to know that this can't crash our C extension. I do have a few concerns about this change:
That said, I understand that this fixes a bug in Ads right now though, and we already do something like this in other places: If you can verify that the performance impacts of this won't adversely affect the Ads client, I'll merge it. |
|
Hi Josh, I benchmarked this vs current master, and decoding was the same speed (within the margin of error) |
This patch has almost no change in behaviour where users have not
patched the implementation of new on either a specific proto object
class, or
Google::Protobuf::MessageExts::ClassMethods. The defaultimplementation of
new, andrb_class_new_instancehave the samebehaviour.
By default when we call
newon a class in Ruby, it goes to theClassclass's implementation:
the
Classimplementation ofnewis (pseudocode, it's actually in c):rb_class_new_instancedoes the same thing, it calls down torb_class_s_new,which calls
rb_class_alloc, thenrb_obj_call_init.rb_funcallis a variadic c function for calling a ruby method on an object,it takes:
VALUEon to which the method should be calledIDwhich should be an ID of a method, usually created withrb_intern,to get an ID from a string
VALUEs, to send to the method call.rb_funcallis the same as calling a method directly in Ruby, and will performancestor chain respecting method lookup.
This means that in C extensions, if nobody has defined the
newmethod on anyclasses or modules in a class's inheritance chain calling
rb_class_new_instanceis the same as callingrb_funcall(klass, rb_intern("new")), however this removes the ability for users to define ormonkey patch their own constructors in to the objects created by the C
extension.
In Ads, we define
newonGoogle::Protobuf::MessageExts::ClassMethodsto allow us to insert amonkeypatch which makes it possible to assign primitive values to wrapper type
fields (e.g. Google::Protobuf::StringValue). The monkeypatch we apply works for
objects that we create for the user via the
newmethod. Before this commit,however, the patch does not work for the
decodemethod, for the reasonsoutlined above. Before this commit, protobuf uses
rb_class_new_instance.After this commit, we use
rb_funcall(klass, rb_intern("new"), 0);to constructprotobuf objects during decoding. While I haven't measured it this will have
a very minor performance impact for decoding (
rb_funcallwill have to go to themethod cache, which
rb_class_new_instancewill not). This does however dothe "more rubyish" thing of respecting the protobuf object's inheritance chain
to construct them during decoding. What this means practically is that the monkeypatch where we override new works for the ads library in decoding. Other users could override new on
Google::Protobuf::MessageExts::ClassMethodsto introduce their own behavioursI have run both Ads and Cloud's test suites for Ruby libraries against this
patch, as well as the protobuf Ruby gem's test suite locally. I also tested what happens if a user does not return the right object type from their overriden new method. A reasonable exception is produced:
TypeError: wrong argument type Object (expected Message), and no crash occurs in the C code.